Creating a complex work of art such as music requires profound creativity. With recent advances in deep learning and powerful models such as Transformers, automatic music generation has made great progress. In the accompaniment-generation setting, creating a coherent drum pattern at the appropriate locations in a song is a challenging task even for experienced drummers. Drum beats tend to follow repetitive patterns through the song, interspersed with measures of fills or improvisation. In this work, we address the task of drum pattern generation conditioned on the music played by four melodic instruments: piano, guitar, bass, and strings. We use a Transformer sequence-to-sequence model to generate a basic drum pattern conditioned on the melodic accompaniment, and find that improvisation is largely absent, likely attributable to its relatively low representation in the training data. We propose a novel feature to capture the extent of improvisation in a measure relative to its neighbors. We train a model to predict improvisation locations from the melodic accompaniment tracks. Finally, we use a novel BERT-inspired in-filling architecture that learns the structure of both the drums and the melody in order to in-fill the improvised elements of the music.
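The improvisation feature itself is not spelled out in the abstract; as a minimal sketch of one plausible variant, assuming measures are represented as binarized drum-onset grids, a measure could be scored by its average Hamming distance to neighbouring measures. All names and sizes below are illustrative, not the authors' exact feature.

```python
import numpy as np

def improvisation_score(bars: np.ndarray, i: int, window: int = 2) -> float:
    """Mean Hamming distance between bar i and its neighbouring bars."""
    n = len(bars)
    neighbours = [j for j in range(max(0, i - window), min(n, i + window + 1)) if j != i]
    if not neighbours:
        return 0.0
    return float(np.mean([np.mean(bars[i] != bars[j]) for j in neighbours]))

# Toy example: 8 bars, 4 drum voices, 16 steps per bar.
rng = np.random.default_rng(0)
bars = np.tile(rng.integers(0, 2, size=(1, 4, 16)), (8, 1, 1))  # repetitive groove
bars[5] = rng.integers(0, 2, size=(4, 16))                      # one "fill" bar
scores = [improvisation_score(bars, i) for i in range(len(bars))]
print(np.argmax(scores))  # 5 -> the bar that deviates most from its neighbours
```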
Every sound we hear is the result of a succession of convolution operations (e.g., room acoustics, microphone characteristics, the resonant properties of the instrument itself, not to mention the characteristics and limitations of the sound reproduction system). In this work we seek to determine, using AI, the best room in which to perform a particular piece. In addition, we use room acoustics as a way to enhance the perceptual quality of a given sound. Historically, rooms (especially churches and concert halls) were designed to host and serve specific musical functions; in some cases the architectural acoustic qualities enhanced the music performed there. We attempt to mimic this by identifying room impulse responses associated with producing enhanced sound quality for particular music. First, a convolutional architecture is trained to take an audio sample and mimic expert ratings of the perceptual quality of a note, reaching roughly 78% accuracy across various instrument families. This gives us a rating function that can automatically score the perceptual pleasantness of any audio sample. Then, using a library of roughly 60,000 synthetic impulse responses mimicking a variety of rooms, materials, etc., we apply a simple convolution operation to alter a sound as if it were played in a particular room. The perceptual evaluator is used to rank the resulting musical sounds and to produce the "best room or concert hall" in which to play a given sound. As a by-product, room acoustics can also be used to turn a poor-quality sound into a "good" one.
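As a rough illustration of the search loop described above (not the paper's trained rater), the sketch below convolves a dry recording with each room impulse response and keeps the room whose output a placeholder perceptual scorer likes best; the scorer merely stands in for the CNN trained on expert ratings.

```python
import numpy as np
from scipy.signal import fftconvolve

def apply_room(dry: np.ndarray, rir: np.ndarray) -> np.ndarray:
    wet = fftconvolve(dry, rir, mode="full")[: len(dry)]
    return wet / (np.max(np.abs(wet)) + 1e-9)       # normalize to avoid clipping

def perceptual_score(audio: np.ndarray) -> float:
    return float(-np.var(np.diff(audio)))           # placeholder for the learned rater

def best_room(dry: np.ndarray, rirs: list) -> int:
    scores = [perceptual_score(apply_room(dry, rir)) for rir in rirs]
    return int(np.argmax(scores))

# Toy usage: random data stands in for a real recording and the RIR library.
rng = np.random.default_rng(0)
dry = rng.standard_normal(16000)
rirs = [rng.standard_normal(4000) * np.exp(-np.linspace(0, 8, 4000)) for _ in range(5)]
print(best_room(dry, rirs))
```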
Modeling long-term dependencies in audio signals is a particularly challenging problem, as even small time scales span hundreds of thousands of samples. With the recent advent of Transformers, neural architectures have become good at modeling dependencies over longer time scales, but they are limited by the quadratic cost of attention when scaled up. We propose a generative autoregressive architecture that can model audio waveforms over a considerable context, exceeding 500,000 samples. Our work learns a latent representation through a CNN front-end and then models temporal dependencies with a Transformer encoder, trained fully end-to-end, thereby allowing the model to learn whatever representation it deems fit for predicting the next sample. Unlike previous works that compare across different time scales to show improvements, we show improvements on a standard dataset with the same number of parameters and the same context. We achieve state-of-the-art performance compared to other approaches such as WaveNet, SaShiMi, and SampleRNN on a standard dataset for modeling long-term structure. This work offers a very exciting direction for the field, given the improvements in context modeling, which can be scaled with more data and potentially yield better results by using billions/trillions of parameters.
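A rough sketch of the idea as described, not the authors' released implementation: a strided CNN front-end turns the raw waveform into latent frames, a causally-masked Transformer encoder models dependencies across frames, and a linear head predicts the quantized samples of the next frame. All sizes here are illustrative.

```python
import torch
import torch.nn as nn

class CNNTransformerAR(nn.Module):
    def __init__(self, n_quant=256, d_model=128, n_layers=4, n_heads=4, hop=64):
        super().__init__()
        self.hop, self.n_quant = hop, n_quant
        self.frontend = nn.Conv1d(1, d_model, kernel_size=hop, stride=hop)   # waveform -> latent frames
        layer = nn.TransformerEncoderLayer(d_model, n_heads, 4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, hop * n_quant)                        # logits for next frame's samples

    def forward(self, wav):                                                  # wav: (batch, samples) in [-1, 1]
        z = self.frontend(wav.unsqueeze(1)).transpose(1, 2)                  # (batch, frames, d_model)
        mask = nn.Transformer.generate_square_subsequent_mask(z.size(1))     # causal attention over frames
        h = self.encoder(z, mask=mask)
        return self.head(h).view(wav.size(0), -1, self.hop, self.n_quant)

model = CNNTransformerAR()
wav = torch.rand(2, 64 * 128) * 2 - 1            # two clips of 8,192 samples
print(model(wav).shape)                          # torch.Size([2, 128, 64, 256])
```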
This paper presents a way of doing large-scale audio understanding without traditional state-of-the-art neural architectures. Ever since deep learning was introduced for understanding audio signals in the past decade, convolutional architectures have been able to achieve state-of-the-art results, surpassing traditional hand-crafted features. In the recent past there has been a similar shift away from traditional convolutional and recurrent neural networks towards purely end-to-end Transformer architectures. In this work we explore an approach based on a bag-of-features model. Our approach does not use any convolutions, recurrence, attention, Transformers, or other methods such as BERT. We utilize micro- and macro-level clustered vanilla embeddings and use an MLP head for classification. We only use a feed-forward encoder-decoder model to obtain bottlenecks of spectral envelopes, spectral patches and slices, as well as multi-resolution spectra. A classification head (a feed-forward layer), similar to the approach in SimCLR, is trained on the learned representations. Using simple codes learned on the latent representations, we show how we surpass traditional convolutional neural network architectures and come strikingly close to outperforming powerful Transformer architectures. This work hopefully paves the way for exciting advances in large-scale audio understanding without massive end-to-end neural architectures.
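A condensed, hypothetical sketch of such a pipeline with toy sizes: a small feed-forward autoencoder compresses spectral patches into bottleneck codes, the codes are clustered into a vocabulary, each clip becomes a histogram of cluster assignments (the bag of features), and an MLP head classifies the histogram. None of the sizes or names below come from the paper.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

bottleneck = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 8))    # encoder
decoder    = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 64))    # decoder
mlp_head   = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 10))   # classifier

patches = torch.rand(1000, 64)                   # stand-in for spectral patches from many clips
opt = torch.optim.Adam(list(bottleneck.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(100):                             # reconstruction pre-training of the bottleneck
    opt.zero_grad()
    loss = nn.functional.mse_loss(decoder(bottleneck(patches)), patches)
    loss.backward(); opt.step()

codes = bottleneck(patches).detach().numpy()
kmeans = KMeans(n_clusters=32, n_init=10).fit(codes)       # cluster codes into a "vocabulary"

def bag_of_features(clip_patches: torch.Tensor) -> torch.Tensor:
    """Histogram of cluster assignments over one clip's patches."""
    ids = kmeans.predict(bottleneck(clip_patches).detach().numpy())
    hist = torch.bincount(torch.as_tensor(ids, dtype=torch.long), minlength=32).float()
    return hist / hist.sum()

logits = mlp_head(bag_of_features(patches[:50]))  # classify one (toy) clip
print(logits.shape)                               # torch.Size([10])
```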
Mixup is a popular data augmentation technique for training deep neural networks where additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve the generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on this insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance exhibiting a 0.8% improvement over Mixup in ImageNet top-1 accuracy.
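For reference, a minimal sketch of standard Mixup (the baseline the paper builds on, not the proposed improvement); the model, data, and hyper-parameters below are toy placeholders.

```python
import torch
import torch.nn.functional as F

def mixup_batch(x, y, alpha=0.2):
    """Mix each example with a randomly chosen partner from the same batch."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], y, y[perm], lam

def mixup_loss(logits, y_a, y_b, lam):
    """The loss uses the same convex combination applied to the two label sets."""
    return lam * F.cross_entropy(logits, y_a) + (1 - lam) * F.cross_entropy(logits, y_b)

model = torch.nn.Linear(32, 10)                        # toy classifier
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
x_mixed, y_a, y_b, lam = mixup_batch(x, y)
loss = mixup_loss(model(x_mixed), y_a, y_b, lam)
loss.backward()
```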
In multi-agent systems with a large number of agents, the contribution of each agent to the value of other agents is typically minimal (e.g., aggregation systems such as Uber and Deliveroo). In this paper, we consider such multi-agent systems in which each agent is self-interested and takes a sequence of decisions, and we represent them as a Stochastic Non-atomic Congestion Game (SNCG). We derive key properties of equilibrium solutions in the SNCG model with non-atomic and also nearly non-atomic agents. With those key equilibrium properties, we provide a novel Multi-Agent Reinforcement Learning (MARL) mechanism that minimizes variance across the values of agents in the same state. To demonstrate the utility of this new mechanism, we provide detailed results on a real-world taxi dataset and also on a generic simulator for aggregation systems. We show that our approach reduces the variance in revenues earned by taxi drivers, while still providing higher joint revenues than leading approaches.
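The exact MARL update is not spelled out in the abstract; as a hedged illustration of the variance-minimization idea only, one could add a penalty on the spread of value estimates among agents that occupy the same state, for example:

```python
import torch

def variance_penalized_loss(td_errors: torch.Tensor,
                            values_same_state: torch.Tensor,
                            beta: float = 0.1) -> torch.Tensor:
    """td_errors: per-agent TD errors; values_same_state: value estimates of agents
    currently sharing the same (aggregated) state; beta weighs the variance penalty."""
    td_loss = (td_errors ** 2).mean()
    variance = values_same_state.var(unbiased=False)
    return td_loss + beta * variance

# Toy usage with made-up numbers for five agents sharing a state.
td = torch.tensor([0.3, -0.1, 0.2, 0.05, -0.4])
vals = torch.tensor([1.2, 0.9, 1.5, 1.1, 0.8])
print(variance_penalized_loss(td, vals))
```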
Dense prediction tasks such as segmentation and detection of pathological entities hold crucial clinical value in the digital pathology workflow. However, obtaining dense annotations on large cohorts is usually tedious and expensive. Contrastive learning (CL) is thus often employed to leverage large volumes of unlabeled data to pre-train the backbone network. To boost CL for dense prediction, some studies have proposed variations of dense matching objectives in pre-training. However, our analysis shows that employing existing dense matching strategies on histopathology images enforces invariance among incorrect pairs of dense features and, thus, is imprecise. To address this, we propose a precise location-based matching mechanism that utilizes the overlapping information between geometric transformations to precisely match regions in two augmentations. Extensive experiments on two pretraining datasets (TCGA-BRCA, NCT-CRC-HE) and three downstream datasets (GlaS, CRAG, BCSS) highlight the superiority of our method in semantic and instance segmentation tasks. Our method outperforms previous dense matching methods by up to 7.2 % in average precision for detection and 5.6 % in average precision for instance segmentation tasks. Additionally, by using our matching mechanism in the three popular contrastive learning frameworks, MoCo-v2, VICRegL and ConCL, the average precision in detection is improved by 0.7 % to 5.2 % and the average precision in segmentation is improved by 0.7 % to 4.0 %, demonstrating its generalizability.
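As a simplified, hypothetical illustration of location-based matching restricted to random cropping (the paper handles general geometric transformations), one can compute the overlap of two crop boxes in original-image coordinates and pair feature-grid cells from the two views whose centres fall on the same absolute location:

```python
import numpy as np

def matched_cells(box_a, box_b, grid=7):
    """box = (x0, y0, x1, y1) in original-image pixels; grid = feature map size."""
    ox0, oy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ox1, oy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    if ox0 >= ox1 or oy0 >= oy1:
        return []                                    # no overlap -> no positive pairs
    pairs = []
    for i in range(grid):                            # iterate cells of view A's feature map
        for j in range(grid):
            # centre of cell (i, j) of view A in original-image coordinates
            cx = box_a[0] + (j + 0.5) / grid * (box_a[2] - box_a[0])
            cy = box_a[1] + (i + 0.5) / grid * (box_a[3] - box_a[1])
            if not (ox0 <= cx <= ox1 and oy0 <= cy <= oy1):
                continue
            # nearest cell of view B covering the same absolute location
            bj = int((cx - box_b[0]) / (box_b[2] - box_b[0]) * grid)
            bi = int((cy - box_b[1]) / (box_b[3] - box_b[1]) * grid)
            pairs.append(((i, j), (bi, bj)))
    return pairs

print(len(matched_cells((0, 0, 160, 160), (80, 80, 240, 240))))  # cells in the shared region
```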
This thesis considers sequential decision problems, where the loss/reward incurred by selecting an action may not be inferred from observed feedback. A major part of this thesis focuses on the unsupervised sequential selection problem, where one cannot infer the loss incurred for selecting an action from observed feedback. We also introduce a new setup named Censored Semi Bandits, where the loss incurred for selecting an action can be observed under certain conditions. Finally, we study the channel selection problem in communication networks, where the reward for an action is only observed when no other player selects that action to play in the round. These problems find applications in many fields, such as healthcare, crowd-sourcing, security, and adaptive resource allocation, among many others. This thesis aims to address the above-described sequential decision problems by exploiting the specific structures these problems exhibit. We develop provably optimal algorithms for each of these setups with weak feedback and validate their empirical performance on different problem instances derived from synthetic and real datasets.
In molecular research, simulation \& design of molecules are key areas with significant implications for drug development, material science, and other fields. Current classical computational power is inadequate to simulate anything beyond small molecules, let alone protein chains of hundreds of peptides. These experiments are therefore done physically in wet labs, but this takes a lot of time, and it is not possible to examine every molecule due to the size of the search space; tens of billions of dollars are spent every year on such research experiments. Molecule simulation \& design has lately advanced significantly through machine learning models. A fresh perspective on the problem of chemical synthesis is provided by deep generative models for graph-structured data. By optimising differentiable models that produce molecular graphs directly, it is feasible to avoid costly search techniques in the discrete and huge space of chemical structures. But these models also suffer from computational limitations when the dimensions become large and consume huge amounts of resources. In recent years, quantum generative machine learning has shown empirical results promising significant advantages over its classical counterparts.